Skip to content

Conversation

@kaffarell
Copy link

Previously the space character was exclusively encoded to '+'. This is wrong, as the URL Standard 0 specifies that the default is '%20'. Another function has been introduced as well, which replicates the old behavior and converts spaces to '+'.
Notice that this breaks the default behavior and could lead to bugs.

Fixes: #927
Fixes: #888

@kaffarell
Copy link
Author

@valenting let me know what you think!

Previously the space character was exclusively encoded to '+'. This is
wrong, as the URL Standard [0] specifies that the default is '%20'.
Another function has been introduced as well, which replicates the old
behavior and converts spaces to '+'.
Notice that this breaks the default behavior and could lead to bugs.

[0]: https://url.spec.whatwg.org/#string-percent-encode-after-encoding

Fixes: servo#927
Fixes: servo#888

Signed-off-by: Gabriel Goller <[email protected]>
@Manishearth
Copy link
Member

I am wary about breaking upstream users, especially the browser users that use this crate (who will likely care about nitty gritty url spec behavior)

@valenting @mrobinson can you check whether the Gecko/Servo callsites are all ones that need spaceIsPlus by default?

We can invert which behavior is the default here to solve this if needed.

@codecov
Copy link

codecov bot commented Apr 30, 2024

Codecov Report

Attention: Patch coverage is 50.00000% with 1 line in your changes missing coverage. Please review.

Please upload report for BASE (main@4b9f1e6). Learn more about missing BASE report.

Files with missing lines Patch % Lines
form_urlencoded/src/lib.rs 50.00% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #928   +/-   ##
=======================================
  Coverage        ?   80.03%           
=======================================
  Files           ?       24           
  Lines           ?     4288           
  Branches        ?        0           
=======================================
  Hits            ?     3432           
  Misses          ?      856           
  Partials        ?        0           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@valenting
Copy link
Collaborator

I am wary about breaking upstream users, especially the browser users that use this crate (who will likely care about nitty gritty url spec behavior)

@valenting @mrobinson can you check whether the Gecko/Servo callsites are all ones that need spaceIsPlus by default?

We can invert which behavior is the default here to solve this if needed.

As far as I can tell servo doesn't use this method https://github.com/search?q=repo%3Aservo%2Fservo+form_urlencoded&type=code

Nor does Gecko https://searchfox.org/mozilla-central/search?q=byte_serialize&path=&case=false&regexp=false

This will be a breaking change anyway.

@Manishearth
Copy link
Member

Since this crate is used a lot should we implement this in a way that is non breaking (with reversed defaults, strongly documented), and perhaps file an issue for a potential 3.0 where we start listing breaking changes?

I suspect 99% of users will not care about this anyway. I feel like doing a 3.0 just for this is a bit much.

Fixed link to url spec.

Signed-off-by: Gabriel Goller <[email protected]>
@kaffarell
Copy link
Author

I am wary about breaking upstream users, especially the browser users that use this crate (who will likely care about nitty gritty url spec behavior)

This is a breaking change, although a very small one. As you said it will probably only affect those who care about the url spec, which in turn, should know this case already!

Since this crate is used a lot should we implement this in a way that is non breaking (with reversed defaults, strongly documented), and perhaps file an issue for a potential 3.0 where we start listing breaking changes?

I have to disagree with the reversed defaults. It's better if we do a 3.0 release and keep it this way, otherwise we will not be 'url-spec compliant'. The url-spec clearly states that the default-behavior should be '%20' and not '+'.

IMO a 3.0 release (or even a minor release if we're feeling adventurous :) ) is the way to go!

@gibfahn
Copy link

gibfahn commented Jun 21, 2024

It would be nice to have an opt-in way to do this at least (which wouldn't have to be a breaking change hopefully), that way users who need this encoding can still use this crate while plans for a breaking change are worked out.

If nothing else this will probably break a bunch of tests.

kaffarell added a commit to kaffarell/rust-url that referenced this pull request Feb 17, 2025
Per default the space character is exclusively encoded to '+'. This is
wrong, as the URL Standard [0] specifies that the default is '%20'.
PR servo#928 fixes this behavior, but is obviously a breaking change. To
introduce this feature early, add a new function that sets the correct
behavior. This way, we can use it without causing a breaking change.

[0]: https://url.spec.whatwg.org/#string-percent-encode-after-encoding

Fixes: servo#927
Fixes: servo#888

Signed-off-by: Gabriel Goller <[email protected]>
@kaffarell
Copy link
Author

@gibfahn that's a good idea! Submitted a new PR that we can merge without a breaking change #1028.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

form_urlencoded::ByteSerialize does NOT conform to the URL standard Add parameter spaceAsPlus to ByteSerialize

4 participants